Conversation
생년월일을 비자 발급일로 파싱하던 오류를 수정함
- 기존에 JPG밖에 지원을 안 했어서 Naver Clova OCR API가 지원하는 JPG, JPEG, PNG에 대해 적용될 수 있도록 수정함. - OCR 수행 중 내부 로직에서 예외 발생 시 이를 응답으로 명확하게 내려주도록 함
📝 WalkthroughWalkthrough온보딩용 비자·여권 OCR API를 추가하고, 전역 문서 처리 서비스로 PDF/이미지 텍스트 추출 및 OCR 파이프라인을 도입·리팩토링했으며, 비자/여권 전용 파서·유틸 및 관련 예외/DTO들을 추가했습니다. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Controller as MemberController
participant Service as MemberService
participant DocSvc as DocumentProcessingService
participant Clova as ClovaOcrService
participant Parser as Visa/Passport Parser
Client->>Controller: POST /onboard/ocr/{visa|passport} (MultipartFile)
Controller->>Service: get{Visa|Passport}Ocr(file)
Service->>DocSvc: extractText(file)
alt PDF 페이지에 텍스트 없음
DocSvc->>Clova: doOcr(page image)
Clova-->>DocSvc: ocrText
else 이미지 파일
DocSvc->>Clova: doOcr(image)
Clova-->>DocSvc: ocrText
end
DocSvc-->>Service: rawText
Service->>Parser: parse(rawText)
Parser-->>Service: VisaInfo / PassportInfo
Service-->>Controller: Ocr{Visa|Passport}Response
Controller-->>Client: 200 OK (BaseResponse)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
Suggested reviewers
시 🐰
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (5)
src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java (2)
13-13: 인스턴스 필드에 상수 명명 규칙 사용
ISO3_MAP은static final이 아닌 인스턴스 필드이므로, Java 명명 규칙에 따라 camelCase(iso3Map)를 사용하는 것이 적절합니다.♻️ 제안된 수정
- private final Map<String, Country> ISO3_MAP; + private final Map<String, Country> iso3Map;이후 사용처도 함께 변경해야 합니다.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java` at line 13, The field ISO3_MAP in CountryResolver is an instance field but uses constant-style naming; rename ISO3_MAP to iso3Map to follow Java camelCase conventions and update all references/usages of the field (constructors, methods, tests) accordingly; ensure access modifiers and behavior remain unchanged and run tests/compilation to confirm no remaining references to ISO3_MAP.
31-32: 예외 무시(silent swallow)로 인한 디버깅 어려움빈
catch블록은 국가 매핑 실패 원인을 파악하기 어렵게 만듭니다. 최소한 디버그 레벨 로깅을 추가하면 운영 환경에서 매핑 누락 문제를 진단하는 데 도움이 됩니다.♻️ 제안된 수정
클래스에
@Slf4j추가 후:} catch (Exception ignored) { + log.debug("Failed to resolve ISO3 for country: {}", country.getCountryName()); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java` around lines 31 - 32, The empty catch in CountryResolver silently swallows exceptions making mapping failures hard to debug; add Lombok `@Slf4j` to the CountryResolver class and replace the empty catch(Exception ignored) block with a log call (e.g., log.debug or log.warn) that includes context and the exception (for example: "Failed to resolve country for input {}" plus the exception) so failures are recorded for diagnostics while preserving existing behavior.src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java (1)
10-12: 들여쓰기 불일치두 번째 생성자의
super호출이 2칸 들여쓰기를 사용하고 있어 첫 번째 생성자(4칸)와 일치하지 않습니다.♻️ 제안된 수정
public DocumentException(DocumentErrorCode errorCode, String message) { - super(errorCode, message); + super(errorCode, message); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java` around lines 10 - 12, The second constructor in DocumentException (public DocumentException(DocumentErrorCode errorCode, String message)) has its super(...) call indented with 2 spaces while the first constructor uses 4 spaces; update the indentation of the super(errorCode, message); line in that constructor to match the 4-space indentation style used for the first constructor so formatting is consistent across both constructors.src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java (1)
164-178: 신규 OCR API에도 예외 응답 명세를 추가해 주세요.다른 엔드포인트와 달리
@CustomExceptionDescription이 없어 Swagger 에러 계약이 비어 있습니다. 클라이언트 연동 안정성을 위해 동일하게 명세하는 것이 좋습니다.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java` around lines 164 - 178, Add the same `@CustomExceptionDescription` metadata to the two OCR endpoints so their Swagger error contracts match the rest of the controller: annotate the getVisaInfo and getPassportInfo methods in MemberController with the same `@CustomExceptionDescription` annotation (and parameters/codes) used on other endpoints in this class so the error responses are documented consistently for these POST /onboard/ocr/visa and POST /onboard/ocr/passport handlers.src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java (1)
79-81: 정규식은 매 호출마다 컴파일하지 말고 상수로 재사용하세요.
collectCandidates에서Pattern.compile을 반복 호출하고 있어 불필요한 오버헤드가 있습니다. 클래스 상수로 올려 재사용하는 편이 낫습니다.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java` around lines 79 - 81, In collectCandidates, avoid compiling the regex on every call; extract Pattern.compile("[A-Z0-9<]{20,}") into a reusable static final Pattern (e.g., MRZ_PATTERN) at class level in PassportOcrParser and replace the local Pattern.compile(...) usage with MRZ_PATTERN.matcher(upper) so the same compiled Pattern is reused across invocations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java`:
- Around line 167-177: Remove the checked IOException from the controller method
signatures and ensure OCR failures are converted to a domain/runtime exception
and mapped to a standard error response: stop declaring "throws IOException" on
getVisaInfo/getPassportInfo and instead handle IO errors coming from
memberService.getVisaOcr(file) and memberService.getPassportOcr(file) by either
(a) catching IOException in the controller and rethrowing a custom unchecked
OcrProcessingException (or returning a standardized BaseResponse error), or (b)
converting IOException inside the service into a domain exception; add an
OcrProcessingException (or reuse a suitable domain exception) and
register/extend the global exception handler (e.g., GlobalExceptionHandler) to
translate that exception into the unified error response and proper HTTP status.
Ensure you reference memberService.getVisaOcr and memberService.getPassportOcr
when making the change.
In `@src/main/java/org/sopt/kareer/domain/member/service/MemberService.java`:
- Around line 184-195: The getVisaOcr and getPassportOcr methods currently
propagate IOException from DocumentProcessingService.extractText; instead catch
IOException inside these methods and convert it to a domain/runtime exception
(e.g., DocumentProcessingException or a checked-to-unchecked wrapper used
elsewhere like ResumeContextService) so the service layer does not expose
checked IO exceptions; update getVisaOcr and getPassportOcr to try-catch around
extractText, log or include the cause, and throw the chosen domain exception
while preserving the original exception as the cause.
In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`:
- Around line 27-37: The PassportOcrParser currently logs sensitive PII (rawText
and MRZ contents) via log.info using variables rawText and mrz (from
DocumentTextUtils.normalize and extractMrz); remove or replace these logs so PII
is not written to INFO: either delete the log.info calls referencing rawText and
mrz or log non-sensitive metadata only (e.g., masked values or presence/length
flags) before returning PassportInfo, ensuring no full name, DOB, document ID,
or MRZ lines are emitted.
In `@src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java`:
- Around line 40-42: In VisaOcrParser.parse, stop logging the full normalized
OCR string (DocumentTextUtils.normalize result) to avoid leaking PII; instead
log only non-identifying metadata such as the normalized text length and a
parse-success indicator (or a stable hash/fingerprint if you need correlation)
using the existing logger (log.info) and keep any detailed errors at debug level
without including raw text; update references in parse to remove the rawText log
and replace it with e.g. length/hashed summary and success/failure information.
In
`@src/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.java`:
- Around line 46-47: The image/PDF content-type checks in
DocumentProcessingService are too strict and fail when Content-Type is missing
or set to application/octet-stream; update the logic so that when handling
uploads (methods calling isImage(...) / extractPagesFromImage(...) and the PDF
branch using isPdf(...) / extractPagesFromPdf(...)) you fall back to the
uploaded filename extension (use the original filename from the MultipartFile or
equivalent) — e.g., enhance or add a helper like isImageByExtension(filename)
and isPdfByExtension(filename) and change the branches to use: if
(isImage(contentType) || isImageByExtension(filename)) return
extractPagesFromImage(file); and similarly for PDF (lines around
extractPagesFromPdf) so JPEG/JPG/PNG files with missing MIME types are correctly
recognized.
In `@src/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.java`:
- Around line 49-51: DocumentDateUtils의 2자리 연도 처리에서 고정 컷오프(>=50)를 사용해 미래 생년월일이
만들어지는 문제가 있습니다; 대신 parse/convert 로직(해당 메서드에서 year += (year >= 50 ? 1900 : 2000);
및 return LocalDate.of(...))을 변경해 먼저 2자리 연도를 1900/2000 기준으로 변환한 뒤(LocalDate 생성
시), 생성된 날짜가 오늘(today)보다 미래라면 year -= 100 하여 100년 뒤로 보정하는 동적 피벗 방식을 사용하도록 수정하세요.
Ensure the adjustment happens before returning the LocalDate and keep validation
of month/day as-is.
- Around line 29-31: The code in DocumentDateUtils force-uppercases the input
(the variable `value` via `.toUpperCase(Locale.ROOT)`), which breaks parsing of
English month names for the `DateTimeFormatter` patterns `d MMM yyyy`/`dd MMM
yyyy`; remove the `.toUpperCase(Locale.ROOT)` call from the `value`
normalization or alternatively make the formatters case-insensitive by using
`DateTimeFormatterBuilder.parseCaseInsensitive()` when building the
`DateTimeFormatter` instances used by the parsing methods (keep references to
the `value` variable and the formatter-building logic in DocumentDateUtils to
locate where to change).
---
Nitpick comments:
In
`@src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java`:
- Around line 164-178: Add the same `@CustomExceptionDescription` metadata to the
two OCR endpoints so their Swagger error contracts match the rest of the
controller: annotate the getVisaInfo and getPassportInfo methods in
MemberController with the same `@CustomExceptionDescription` annotation (and
parameters/codes) used on other endpoints in this class so the error responses
are documented consistently for these POST /onboard/ocr/visa and POST
/onboard/ocr/passport handlers.
In `@src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java`:
- Line 13: The field ISO3_MAP in CountryResolver is an instance field but uses
constant-style naming; rename ISO3_MAP to iso3Map to follow Java camelCase
conventions and update all references/usages of the field (constructors,
methods, tests) accordingly; ensure access modifiers and behavior remain
unchanged and run tests/compilation to confirm no remaining references to
ISO3_MAP.
- Around line 31-32: The empty catch in CountryResolver silently swallows
exceptions making mapping failures hard to debug; add Lombok `@Slf4j` to the
CountryResolver class and replace the empty catch(Exception ignored) block with
a log call (e.g., log.debug or log.warn) that includes context and the exception
(for example: "Failed to resolve country for input {}" plus the exception) so
failures are recorded for diagnostics while preserving existing behavior.
In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`:
- Around line 79-81: In collectCandidates, avoid compiling the regex on every
call; extract Pattern.compile("[A-Z0-9<]{20,}") into a reusable static final
Pattern (e.g., MRZ_PATTERN) at class level in PassportOcrParser and replace the
local Pattern.compile(...) usage with MRZ_PATTERN.matcher(upper) so the same
compiled Pattern is reused across invocations.
In
`@src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java`:
- Around line 10-12: The second constructor in DocumentException (public
DocumentException(DocumentErrorCode errorCode, String message)) has its
super(...) call indented with 2 spaces while the first constructor uses 4
spaces; update the indentation of the super(errorCode, message); line in that
constructor to match the 4-space indentation style used for the first
constructor so formatting is consistent across both constructors.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 17a9fc2f-77d4-4c04-9902-2e7a0731ad3b
📒 Files selected for processing (21)
src/main/java/org/sopt/kareer/domain/jobposting/util/ResumeContextService.javasrc/main/java/org/sopt/kareer/domain/member/controller/MemberController.javasrc/main/java/org/sopt/kareer/domain/member/dto/response/OcrPassportResponse.javasrc/main/java/org/sopt/kareer/domain/member/dto/response/OcrVisaResponse.javasrc/main/java/org/sopt/kareer/domain/member/entity/enums/VisaType.javasrc/main/java/org/sopt/kareer/domain/member/service/MemberService.javasrc/main/java/org/sopt/kareer/domain/member/util/CountryResolver.javasrc/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.javasrc/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.javasrc/main/java/org/sopt/kareer/global/document/dto/response/PageText.javasrc/main/java/org/sopt/kareer/global/document/exception/DocumentErrorCode.javasrc/main/java/org/sopt/kareer/global/document/exception/DocumentException.javasrc/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.javasrc/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.javasrc/main/java/org/sopt/kareer/global/document/util/DocumentTextUtils.javasrc/main/java/org/sopt/kareer/global/external/ai/service/RagEmbeddingService.javasrc/main/java/org/sopt/kareer/global/external/clova/dto/response/PageText.javasrc/main/java/org/sopt/kareer/global/external/clova/exception/ClovaErrorCode.javasrc/main/java/org/sopt/kareer/global/external/clova/exception/ClovaException.javasrc/main/java/org/sopt/kareer/global/external/clova/service/ClovaOcrService.javasrc/main/java/org/sopt/kareer/global/external/clova/service/DocumentProcessingService.java
💤 Files with no reviewable changes (2)
- src/main/java/org/sopt/kareer/global/external/clova/dto/response/PageText.java
- src/main/java/org/sopt/kareer/global/external/clova/service/DocumentProcessingService.java
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java (1)
150-170: 불필요한 try-catch 블록입니다.
CountryResolver.resolveIso3()는 예외를 던지지 않고 알 수 없는 코드에 대해null을 반환합니다(CountryResolver.java:47-50참조). try-catch 래퍼는 제거해도 됩니다.♻️ 리팩토링 제안
private Country extractCountry(String line1, String line2) { - try { - if (line1 != null) { - String fitted = normalizeLineLength(line1); - String code = substringSafely(fitted, 2, 5); - Country c = countryResolver.resolveIso3(code); - if (c != null) return c; - } - - if (line2 != null) { - String fitted = normalizeLineLength(line2); - String code = substringSafely(fitted, 10, 13); - return countryResolver.resolveIso3(code); - } - - return null; - - } catch (Exception e) { - return null; + if (line1 != null) { + String fitted = normalizeLineLength(line1); + String code = substringSafely(fitted, 2, 5); + Country c = countryResolver.resolveIso3(code); + if (c != null) return c; } + + if (line2 != null) { + String fitted = normalizeLineLength(line2); + String code = substringSafely(fitted, 10, 13); + return countryResolver.resolveIso3(code); + } + + return null; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java` around lines 150 - 170, The try-catch in extractCountry is unnecessary because CountryResolver.resolveIso3 does not throw; remove the surrounding try-catch block in the extractCountry method and keep the existing logic: check line1 -> normalizeLineLength(line1) -> substringSafely(...,2,5) -> countryResolver.resolveIso3(code) and if non-null return it; otherwise check line2 using normalizeLineLength(line2) and substringSafely(...,10,13) and return countryResolver.resolveIso3(code); preserve the final return null path and existing helper calls (normalizeLineLength, substringSafely, countryResolver.resolveIso3) but eliminate the exception handling wrapper.src/main/java/org/sopt/kareer/domain/member/service/MemberService.java (1)
184-197: 이전 리뷰에서 지적된IOException노출 문제가 해결되었습니다.
DocumentProcessingService.extractText()의IOException을 도메인 예외(DocumentException)로 적절히 변환하고 있습니다.다만, 현재
DocumentException생성자는(ErrorCode errorCode, String messageDetail)형태만 지원하므로, 디버깅을 위해 원본 예외 정보를 messageDetail 파라미터로 전달하는 것을 권장합니다:♻️ 원본 예외 정보를 messageDetail로 포함
} catch(Exception e) { throw new DocumentException( - DocumentErrorCode.OCR_PROCESSING_FAILED + DocumentErrorCode.OCR_PROCESSING_FAILED, + e.getMessage() ); }또는 더 자세한 정보가 필요하면:
} catch(Exception e) { throw new DocumentException( - DocumentErrorCode.OCR_PROCESSING_FAILED + DocumentErrorCode.OCR_PROCESSING_FAILED, + e.toString() ); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/org/sopt/kareer/domain/member/service/MemberService.java` around lines 184 - 197, In getVisaOcr in MemberService, wrap and propagate the original exception details into the DocumentException so debugging info isn't lost: when catching generic Exception (or specifically IOException rethrown from DocumentProcessingService.extractText), call the DocumentException constructor that accepts (ErrorCode, String messageDetail) and pass the original exception's message (e.getMessage()) or a combined message with context (e.g. "OCR failed: " + e.getMessage()), and optionally include exception class name; keep the thrown type DocumentException and do this inside the catch(Exception e) block referenced in getVisaOcr.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@src/main/java/org/sopt/kareer/global/external/clova/service/ClovaOcrService.java`:
- Around line 61-71: The ClovaOcrService builds a ClovaException including raw
provider response (errorBody) which then gets returned to clients; change the
code so the detailed provider body is only logged via log.error and not embedded
in the ClovaException message — create the ClovaException using safe metadata
(e.g., "CLOVA OCR error. status=<status>") and
ClovaErrorCode.EXTRACT_IMAGE_FAILED, while keeping the full errorBody and any
exception details in the log.debug/error call; apply the same change to the
second occurrence that constructs a ClovaException around lines handling
clientResponse/bodyToMono to ensure external messages are generalized.
- Around line 77-85: The current check in ClovaOcrService that returns "" for
response==null, images==null/empty, or fields==null masks provider errors;
change the logic so only fields.isEmpty() is treated as a valid empty OCR result
(return ""), while response==null, response.images()==null,
response.images().isEmpty(), or fields==null should throw an exception (e.g., an
ExternalServiceException or propagate a runtime exception) so
DocumentProcessingService sees provider failures instead of silently treating
them as empty pages; adjust the checks around response, images(), and fields()
accordingly in the method that reads the OCR response.
---
Nitpick comments:
In `@src/main/java/org/sopt/kareer/domain/member/service/MemberService.java`:
- Around line 184-197: In getVisaOcr in MemberService, wrap and propagate the
original exception details into the DocumentException so debugging info isn't
lost: when catching generic Exception (or specifically IOException rethrown from
DocumentProcessingService.extractText), call the DocumentException constructor
that accepts (ErrorCode, String messageDetail) and pass the original exception's
message (e.getMessage()) or a combined message with context (e.g. "OCR failed: "
+ e.getMessage()), and optionally include exception class name; keep the thrown
type DocumentException and do this inside the catch(Exception e) block
referenced in getVisaOcr.
In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`:
- Around line 150-170: The try-catch in extractCountry is unnecessary because
CountryResolver.resolveIso3 does not throw; remove the surrounding try-catch
block in the extractCountry method and keep the existing logic: check line1 ->
normalizeLineLength(line1) -> substringSafely(...,2,5) ->
countryResolver.resolveIso3(code) and if non-null return it; otherwise check
line2 using normalizeLineLength(line2) and substringSafely(...,10,13) and return
countryResolver.resolveIso3(code); preserve the final return null path and
existing helper calls (normalizeLineLength, substringSafely,
countryResolver.resolveIso3) but eliminate the exception handling wrapper.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a91c5ffc-c9bf-470c-b80f-ef55870d0fde
📒 Files selected for processing (8)
src/main/java/org/sopt/kareer/domain/member/controller/MemberController.javasrc/main/java/org/sopt/kareer/domain/member/service/MemberService.javasrc/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.javasrc/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.javasrc/main/java/org/sopt/kareer/global/document/exception/DocumentErrorCode.javasrc/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.javasrc/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.javasrc/main/java/org/sopt/kareer/global/external/clova/service/ClovaOcrService.java
🚧 Files skipped from review as they are similar to previous changes (3)
- src/main/java/org/sopt/kareer/global/document/exception/DocumentErrorCode.java
- src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java
- src/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.java
Related issue 🛠
Work Description 📝
Country에 담긴 값들을 매핑해주도록 했습니다.ScreenShots 📷
To Reviewers 📢
Summary by CodeRabbit
릴리스 노트
새로운 기능
리팩토링
버그 수정